Summary

We  analyze  here  the  random  delays  experienced  by  a  packet 
traversing  a  buffered,  multistage,  packet-switching,  banyan  network. 
Although  the  distribution  of  delays  at  the  first  stage  of  the  network 
is known (see [KS,83] and [KSW,83]), it has been an open problem up to
now  to  find  the  form  of  delays  at  subsequent  stages.  We  prove  here 
that  the  mean  delays  at  subsequent  stages  are  the  same  as  the  mean 
delay  of  packets  at  the  first  stage,  for  the  case  of  infinite  buffers. 
This  result  holds  for  a  large  class  of  arrival  patterns,  namely  when 
packets  are  generated  at  the  source  nodes  by  independent,  identically 
distributed  random  processes,  that  uniformly  distribute  the  packets 
over  all  of  the  sink  nodes. 

We  also  consider  networks  with  finite  buffer  sizes.  We  provide  an 
approximation  method  for  estimation  of  mean  queuing  delays  and  a 
formula  for  the  average  number  of  packets  arriving  at  the  memory  end  of 
the  network  per  cycle,  as  a  function  of  the  buffer  sizes.  Our  results 
are  in  good  agreement  with  simulation  results. 

1.   Introduction 

Highly  parallel  computing  systems  seem  to  be  the  answer  to  the 
need  for  increased  computing  power.  Large  scale  multiprocessor  systems 
with  thousands  of  processors  have  been  proposed.  A  typical 
configuration for such a system is shown in Figure 1: Many identical
processors  are  connected  via  an  interconnection  network  to  identical 
memory  modules.  The  network  supports  dynamic  access  from  each 
processor  to  each  memory  module.  In  a  tightly  coupled  MIMD  machine, 
the  network  traffic  consists  of  small  packets  (requests  to  memory  and 
replies),  with  the  requests  being  dynamically  generated  independently 
at  each  processor. 
We consider packet switching networks built of switches connected
by unidirectional lines. A k-input, k-output (k x k) switch can receive
packets  at  each  of  its  k  input  ports  and  send  them  through  each  of  its 
k  output  ports.  Formally,  a  network  is  a  labeled  digraph  where  nodes 
are  of  the  following  three  types: 

(  i)    source  nodes  (indegree  0,  outdegree  1) 

(ii)   sink  nodes  (indegree  1,  outdegree  0) 

(iii)  switches  (positive  indegree  and  outdegree) 

We restrict our analysis to oblivious routing algorithms, i.e.
algorithms in which the path of a packet through the network is fixed
at the source node issuing it. The path can be encoded as the sequence
of labels of the successive switch outputs of the path (path
descriptor).

Following  [KS,  83],  [GL,  83],  we  define  a  banyan  network  to  be  a 
network  with  a  unique  path  from  each  source  to  each  sink  node.  An 
n-stage  banyan  is  a  banyan  network  where  the  nodes  can  be  arranged  in  n 
stages,  with  all  the  source  nodes  connected  to  switches  at  the  first 
stage,  and  all  the  outputs  at  stage  i  connected  to  inputs  at  stage  i+1. 
An  n-stage  rectangular  banyan  network  of  degree  k  is  an  n-stage  banyan 
network built of k x k switches.

We  model  here  the  performance  of  buffered  banyan  networks  of 
degree  k,  where  buffers  are  used  in  each  output  of  the  switch,  to  queue 
conflicting  packets. 
2.  Previous work
Several  authors  have  analyzed  the  performance  of  banyan  networks 
with  buffers  of  length  one  and  performed  simulations  for  large  buffers. 
(See  [DJ,81]  and  references  there).  Kruskal  and  Snir  [KS,83]  provided 
analytic  results  for  the  queuing  delays  of  the  first  network  stage  for 
infinite  buffers  and  they  did  extensive  simulations  which  agree  with 
the  theoretical  results.  In  [KSW,83],  the  generating  function  of  the 
distribution  of  queuing  delays  of  the  first  network  stage  was  found 
(for  a  large  class  of  arrival  patterns).  Kruskal  and  Snir  conjectured 
that  the  delays  at  stages  beyond  the  first  are  very  close  to  those  at 
the first stage. We prove this conjecture, for mean delays and
infinite buffers, in this paper.

3.  Modeling  assumptions  for  our  analysis 
We  shall  analyze  the  performance  of  rectangular  banyan  networks 
under  the  assumptions  usually  appearing  in  the  literature  (see  e.g. 
[PA,81], [KS,83]). We assume that packets are generated at each source
node by independent, identically distributed random processes. Time
is assumed discrete and the network is assumed to be synchronous,
so that packets can be sent only at times $t$, $2t$, $3t$, etc., where $t$
is the network cycle time ($t$ is also the cycle time of the switch).

We  assume  that  the  processor  cycle  time  is  equal  to  the  switch 
cycle  time  and  that  each  processor  sends  packets  with  equal  probability 
to  any  sink  node  (uniform  load).  We  also  assume  that  the  routing  logic 
at each switch is fair, i.e. conflicts are resolved randomly. These
assumptions, together with the uniqueness of paths in banyan networks,
imply the following result.

Lemma 1 [KS,83]

(1)  The  patterns  of  arrivals  at  the   inputs   of   the   same   switch  are 
independent  (and  that  holds  for  every  stage). 

(2)  For  regular  networks,  the  patterns  of  packet  arrivals  at  the  inputs 
of  the  same  stage  have  the  same  distribution. 

(3)  Packets  arriving  at   an  input   of  a   switch  of   some   stage  are 
uniformly  distributed  over  the  outputs  of  the  switch. 

For simplicity we also assume that the transit time from a switch
output to the next switch input is zero and that $t = 1$. Figure 2 shows
our switch model. In Figure 2 each of the k servers is deterministic
(with service time equal to $t$) and a packet arriving at any particular
input of the switch has an equal chance (1/k) of going to any of the k
output queues. Arrivals to the various queues are assumed to occur just
before the end of the corresponding cycle.


Figure  2 
A  k  by  k  switch  model  with  buffer  size  b  >  0. 


PART I:  Infinite buffers

4.   Notation  and  results  for  the  "isolated"  k  x  k  switch 

4.1   Definitions 
Consider  a  discrete   time   queue  with  a  deterministic  server  (of 
service  time  equal  to  1). 

Definition. Let $v_n$ be the number of arrivals at the queue at cycle n.

Definition. Let $q_n$ be the number of customers in the queue and
server at the end of cycle n.

Definition. Let $\Delta_k = 0$ if $k \le 0$, $\Delta_k = 1$ otherwise.

For discrete time queues, then, the following holds:

    $q_{n+1} = q_n - \Delta_{q_n} + v_{n+1}$        (EQ 1)
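
The recursion of EQ 1 (one departure per cycle from a nonempty queue, then the new arrivals are added) can be traced with a few lines of Python. This is only an illustrative sketch: the function names and the arrival sequence are made up for the example and do not come from the paper.

```python
def next_queue_size(q, arrivals):
    """One cycle of EQ 1: serve one customer if the queue is nonempty,
    then add the arrivals of the next cycle."""
    served = 1 if q > 0 else 0          # this is Delta_q
    return q - served + arrivals


def trace(arrival_sequence, q0=0):
    """Queue sizes q_0, q_1, ... produced by a given arrival sequence."""
    sizes = [q0]
    for v in arrival_sequence:
        sizes.append(next_queue_size(sizes[-1], v))
    return sizes


if __name__ == "__main__":
    # Hypothetical arrival counts for six consecutive cycles.
    print(trace([2, 0, 1, 3, 0, 0]))
```

Starting from an empty queue, the server idles only in cycles that begin with no customers, which is exactly what the $\Delta$ term encodes.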
Definition. Let $q = \lim_{n \to \infty} q_n$ and $v = \lim_{n \to \infty} v_n$ (i.e. the steady-state
counterparts of $q_n$, $v_n$).

Definition.  The  utilization,  u,  of  a  discrete  time  queue  is  the 
Prob{q  >  0}  . 

Definition.   Let  E(x)  be  the  expected  value  of  random  variable  x. 

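Remark 1. $u = E(\Delta_q)$.

Proof: $u = \mathrm{Prob}\{q > 0\} = \sum_{k=1}^{\infty} \mathrm{Prob}\{q = k\} = \sum_{k=0}^{\infty} \Delta_k \mathrm{Prob}\{q = k\} = E(\Delta_q)$.

Remark 2. $u = E(v)$.

Proof: From EQ 1, taking means and assuming a steady state, we have

    $E(q) = E(q) - E(\Delta_q) + E(v)$.

Hence $E(\Delta_q) = E(v)$ and, by Remark 1, $u = E(v)$.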

Definition. A GI|D|1 discrete time queue is a queue whose server
is deterministic (of service time 1) and whose arrival process has the
property that, for all $i \ne j$, $v_i$ and $v_j$ are independent.

4.2  A  useful  example 
Example 1. If we assume that each processor independently generates, at
each cycle, a packet with probability p, then each output queue of a
k x k switch of the first stage can be modeled by a GI|D|1 queue whose
arrival process is a Bernoulli process B(p/k, k), i.e. where

    $\mathrm{Prob}\{v_n = x\} = \binom{k}{x} (p/k)^x (1 - p/k)^{k-x}$.

Let $v_{i,j}$ be the number of arrivals at the output queue i of the
switch (i = 1,...,k) during some switch cycle j.

Let $V_j$ be the total number of arrivals at the inputs of the k x k
switch during cycle j. Clearly

    $\mathrm{Prob}\{V_j = x\} = \binom{k}{x} p^x (1 - p)^{k-x}$

and

    $\mathrm{Prob}\{v_{1,j} = a_1, \ldots, v_{k,j} = a_k \mid a_1 + \cdots + a_k = x\} = \frac{x!}{a_1! \cdots a_k!} (1/k)^x$        (EQ 2)

In  the  full  paper  we  show  that 

Lemma 2. The marginal densities $\mathrm{Prob}\{v_{i,j} = a_i\}$ of EQ 2 are the same
as those of the Bernoulli B(p/k, k).

So, the GI|D|1 queue with B(p/k, k) arrivals is an exact model for the
(marginal) queue size distribution of each of the output queues of
Example 1.
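
The claim that one output queue sees B(p/k, k) arrivals can be checked numerically for small k. The sketch below assumes the setting of Example 1: the total number of packets entering the switch is binomial with k trials and success probability p, and each packet independently picks the given output with probability 1/k. The function names and the values of k and p are illustrative only.

```python
from math import comb, isclose


def binomial_pmf(n, prob, x):
    """Probability of x successes in n independent trials."""
    return comb(n, x) * prob**x * (1 - prob)**(n - x)


def marginal_arrivals(k, p, a):
    """Prob{a packets reach one fixed output queue}: condition on the
    total x entering the switch, then route each packet to the fixed
    output with probability 1/k."""
    return sum(binomial_pmf(k, p, x) * binomial_pmf(x, 1.0 / k, a)
               for x in range(a, k + 1))


if __name__ == "__main__":
    k, p = 4, 0.6                       # hypothetical switch size and load
    for a in range(k + 1):
        print(a, marginal_arrivals(k, p, a), binomial_pmf(k, p / k, a))
```

The two printed columns agree up to floating-point rounding, which is the content of Lemma 2 for this choice of k and p.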
For the considered example, it is easy to see that $E(v) = p = u$
for each output queue of each switch of the first stage.
[KS,83] proved that

Fact 1. The steady-state mean queue length of the GI|D|1 queue with
B(p/k, k) arrivals is

    $E(q) = p + \frac{p^2 (1 - 1/k)}{2(1 - p)}$        (EQ 3)

(This equation includes the packet in service).
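
A short Monte Carlo experiment gives a feel for Fact 1. The sketch below simulates a single output queue of a first-stage switch under the assumptions of Example 1 and compares the time-averaged queue size with $p + p^2(1 - 1/k)/(2(1 - p))$. The function names, the seed, and the run length are arbitrary choices made for the illustration.

```python
import random


def simulate_first_stage_queue(k, p, cycles, seed=1):
    """Time-averaged size of one output queue of a k x k first-stage
    switch with an infinite buffer (queue size taken at end of cycle)."""
    rng = random.Random(seed)
    q = 0
    total = 0
    for _ in range(cycles):
        # Each of the k inputs carries a packet with probability p, and
        # a packet selects output 0 with probability 1/k.
        arrivals = sum(1 for _ in range(k)
                       if rng.random() < p and rng.randrange(k) == 0)
        q = q - (1 if q > 0 else 0) + arrivals      # recursion of EQ 1
        total += q
    return total / cycles


def mean_queue_length(k, p):
    """Closed form of Fact 1 (includes the packet in service)."""
    return p + p * p * (1 - 1 / k) / (2 * (1 - p))


if __name__ == "__main__":
    k, p = 2, 0.5
    print(simulate_first_stage_queue(k, p, 200_000), mean_queue_length(k, p))
```

The estimate fluctuates with the seed and the run length, so it should only be expected to match the closed form to within sampling error.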

An easy corollary of Fact 1, also proved in [KS,83], is

Fact 2. The mean queuing delay (not including the service cycle) of a
packet in the GI|D|1 queue with B(p/k, k) arrivals is

    $W = \frac{p (1 - 1/k)}{2(1 - p)}$ cycles.


In [KSW,83], (EQ 3) is extended as follows for the GI|D|1 queue
with arbitrary arrivals:

    $E(q) = u + \frac{u^2 (1 - 1/k)}{2(1 - u)}$        (EQ 4)

5.  The  mean  queuing  delays  at  subsequent  network  stages 

Let $v$ be the steady-state number of arrivals per cycle at a queue
of the first stage of the network. Note that our assumptions about
packet generation are those of Section 3 (much more general than those
of Example 1).

Theorem  1 .  In  a  rectangular  banyan  network  of  k  x  k  switches  and 
infinite  buffers,  following  the  assumptions  of  Section  3,  the  following 
hold: 

(a) The utilization of each output queue of each switch of every
stage is equal to $E(v)$.

(b) The steady-state mean queue size of each output queue of each
switch of every stage is equal to

    $E(v) + \frac{E^2(v) (1 - 1/k)}{2(1 - E(v))}$.

Proof: We use induction on the stage number, j. The theorem holds
for j=1, as we explained in Section 4. Let the theorem hold for all
stages up to (and including) j=m. Let S be a particular switch of stage
m+1 and let $Q_1$ be its first output queue. Let $q^n_{m1}, \ldots, q^n_{mk}$ be the
number of customers in the queues $Q_{m1}, \ldots, Q_{mk}$ of the m-th stage which
feed the k inputs of the switch S of the (m+1)-th stage, at cycle n.

Let $\beta^n_1$ be the number of customers at queue $Q_1$ at cycle n (see
Figure 3).


Figure 3: Stages m and m+1.

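Let $x^n_i$ be equal to 1 if a customer arrives from input i at $Q_1$
during cycle n, and let $x^n_i$ be 0 otherwise, i.e.

    $x^n_i = \Delta_{q^n_{mi}}$ with probability 1/k
    $x^n_i = 0$ otherwise.

Then

    $\beta^{n+1}_1 = \beta^n_1 - \Delta_{\beta^n_1} + x^{n+1}_1 + \cdots + x^{n+1}_k$        (EQ 5)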


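If we assume that a steady state exists and that $\beta_1 = \lim_{n \to \infty} \beta^n_1$, we get,
from EQ 5:

    $E(\Delta_{\beta_1}) = E(x_1) + \cdots + E(x_k)$
                 $= \frac{1}{k} (E(\Delta_{q_{m1}}) + \cdots + E(\Delta_{q_{mk}}))$

(where $q_{mi} = \lim_{n \to \infty} q^n_{mi}$, i = 1,...,k, and $x_i$ is the steady-state
counterpart of $x^n_i$).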

By the induction hypothesis, $E(\Delta_{q_{mi}}) = E(v)$ for all i = 1,...,k. So,

    $E(\Delta_{\beta_1}) = k \cdot \frac{1}{k} E(v) = E(v)$.

By Remark 1 of Section 4 we then conclude that the utilization of $Q_1$ of
S is $E(v)$. By symmetry, this holds for each queue of stage m+1. This
proves clause (a) of Theorem 1.

By squaring EQ 5 and taking limits, we get

    $\beta_1^2 = \beta_1^2 + \Delta_{\beta_1}^2 + (x_1 + \cdots + x_k)^2$
            $- 2 \beta_1 \Delta_{\beta_1} - 2 \Delta_{\beta_1} (x_1 + \cdots + x_k) + 2 \beta_1 (x_1 + \cdots + x_k)$.

Again, by $\Delta_{\beta_1}^2 = \Delta_{\beta_1}$, $\beta_1 \Delta_{\beta_1} = \beta_1$, and by noting that $\beta_1$ is independent
of each of $x_1, \ldots, x_k$ (due to infinite buffers), we get, by taking
expectations,

    $2 E(\beta_1) - 2 E(\beta_1) E(x_1 + \cdots + x_k)$
        $= E(\Delta_{\beta_1}) + E((x_1 + \cdots + x_k)^2) - 2 E(\Delta_{\beta_1}) E(x_1 + \cdots + x_k)$.

Since $E(x_i) = \frac{1}{k} E(\Delta_{q_{mi}}) = \frac{1}{k} E(v)$ for i = 1,...,k, we get

    $E(\beta_1) (2 - 2 E(v)) = E(v) + E(\sum_i x_i^2 + 2 \sum_{i<j} x_i x_j) - 2 E^2(v)$.

But $x_i^2 = x_i$, i = 1,...,k, and so

    $E(x_i^2) = E(x_i) = \frac{1}{k} E(v)$.

Also, by Clause 1 of Lemma 1, $x_i$, $x_j$ are independent (for $i \ne j$), so

    $E(x_i x_j) = E(x_i) E(x_j)$, $i \ne j$,

hence

    $2 E(\beta_1) (1 - E(v)) = E(v) - 2 E^2(v) + \sum_{i=1}^{k} E(x_i^2) + 2 \sum_{i<j} E(x_i) E(x_j)$

    $2 E(\beta_1) (1 - E(v)) = E(v) - 2 E^2(v) + k \cdot \frac{E(v)}{k} + k(k-1) \cdot \frac{E^2(v)}{k^2}$

leading to

    $E(\beta_1) = E(v) + \frac{E^2(v) (1 - 1/k)}{2(1 - E(v))}$        (EQ 6)

By symmetry, the above holds for each queue of stage m+1. This
completes the induction and proves clause (b) of Theorem 1.
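
The last algebraic step of the proof can be double-checked with exact rational arithmetic. The sketch below evaluates the right-hand side of the relation $2E(\beta_1)(1 - E(v)) = E(v) - 2E^2(v) + \sum_i E(x_i^2) + 2\sum_{i<j} E(x_i)E(x_j)$ with $E(x_i) = E(x_i^2) = E(v)/k$ and compares the result with the closed form of EQ 6. The function names and the sample values are illustrative choices, not part of the paper.

```python
from fractions import Fraction


def mean_queue_from_moments(ev, k):
    """Solve the second-moment balance relation for E(beta_1)."""
    ex = ev / k                         # E(x_i) = E(x_i^2) = E(v)/k
    pairs = k * (k - 1) // 2            # number of index pairs i < j
    rhs = ev - 2 * ev**2 + k * ex + 2 * pairs * ex**2
    return rhs / (2 * (1 - ev))


def closed_form(ev, k):
    """E(v) + E(v)^2 (1 - 1/k) / (2 (1 - E(v)))."""
    return ev + ev**2 * (1 - Fraction(1, k)) / (2 * (1 - ev))


if __name__ == "__main__":
    for k in (2, 3, 4, 8):
        for ev in (Fraction(1, 5), Fraction(1, 2), Fraction(9, 10)):
            assert mean_queue_from_moments(ev, k) == closed_form(ev, k)
    print("balance relation and closed form agree")
```

Using `Fraction` avoids floating-point rounding, so equality here is exact for the sampled values.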

Theorem 2. The mean queuing delay of a packet in each stage of a
rectangular banyan network of k x k switches, following the assumptions
of Section 3, is

    $W = \frac{E(v) (1 - 1/k)}{2(1 - E(v))}$.
Proof: Just note (in the proof of Theorem 1) that the mean arrival
rate at $Q_1$ of S is

    $\lambda = E(x_1 + \cdots + x_k) = E(v)$

and then apply Little's formula,

    $\lambda (W + 1) = E(\beta_1)$.
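
As a worked illustration of this step, the following sketch solves Little's formula for W, using the mean queue size of Theorem 1(b), and evaluates it for a few loads. The function names and the sample loads are assumptions made for the example.

```python
def mean_queue_size(ev, k):
    """Mean queue size of Theorem 1(b), packet in service included."""
    return ev + ev**2 * (1 - 1 / k) / (2 * (1 - ev))


def mean_queuing_delay(ev, k):
    """Little's formula ev * (W + 1) = E(beta_1), solved for W."""
    return mean_queue_size(ev, k) / ev - 1


if __name__ == "__main__":
    k = 2                                # 2 x 2 switches
    for load in (0.2, 0.4, 0.6):         # hypothetical values of E(v)
        print(load, round(mean_queuing_delay(load, k), 4))
```

The delay obtained this way coincides with the closed form $E(v)(1 - 1/k)/(2(1 - E(v)))$ of Theorem 2 and does not depend on the stage number.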

PART  II.   Finite  buffers 

6.   An  approximate  analysis  for  the  network  bandwidth. 

6.1.  A  formula  for  the  bandwidth 
Let b denote the size of each buffer in the network. We make the
same assumptions as in PART I about the input to the network and
about the network architecture. Let u(b) be the utilization of
an output queue Q of a switch. If q is the (steady state) size of Q,
we know (as in Section 4) that

    $u(b) = E(\Delta_q)$.

We assume here that u(b) is the same for all queues of all stages (this
is supported by the simulation results).

Let $p_m$ be the probability of having a packet on an output of the
m-th stage of the network. Clearly then

    $p_{m+1} = u(b) + (1 - u(b)) (1 - (1 - p_m/k)^k)$,        (EQ 7)

because, if the queue is nonempty, it will produce an output packet
with certainty; otherwise, the production of a packet at the output
depends on the number of packets (which depart from stage m and
select the particular queue of stage m+1) being nonzero.

The boundary condition is $p_0 = u(b)$. The network bandwidth (i.e.
the average number of packets arriving at the memory end of the network
per cycle) is then

    Bandwidth = $N \cdot p_{\log N}$.

[KS,82] considered the case of b = 0 and found that, in that case,

    $p_m = \frac{2k}{(k-1) m} + o(1/m)$.
(EQ 7) can be written as

    $p_{m+1} = 1 - (1 - u(b)) (1 - p_m/k)^k$.

It can be shown (proof in the full paper) that:

Lemma 3:

    $p_m = u(b) + (1 - u(b)) \frac{2k}{(k-1) m} + o(1/m)$,

leading to

    Bandwidth = $N u(b) + \frac{(1 - u(b)) (2k)}{k-1} \cdot \frac{N}{\log N} + o(N / \log N)$.
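
The recursion for the output probabilities is easy to iterate directly. The sketch below applies $p_{m+1} = 1 - (1 - u(b))(1 - p_m/k)^k$ stage by stage and multiplies the final value by the number of processors. It assumes that a network with n stages of k x k switches has $N = k^n$ processors, and it lets the caller override the starting value $p_0$; both the function names and the optional `p0` parameter are illustrative choices.

```python
def output_probabilities(u, k, stages, p0=None):
    """Iterate p_{m+1} = 1 - (1 - u)(1 - p_m / k)^k for the given number
    of stages. By default p_0 = u, the boundary condition in the text."""
    p = u if p0 is None else p0
    probs = [p]
    for _ in range(stages):
        p = 1 - (1 - u) * (1 - p / k) ** k
        probs.append(p)
    return probs


def bandwidth(u, k, stages, p0=None):
    """N * p_(last stage), assuming N = k ** stages processors."""
    n_processors = k ** stages
    return n_processors * output_probabilities(u, k, stages, p0)[-1]


if __name__ == "__main__":
    # Hypothetical utilization 0.5 in a 6-stage network of 2 x 2 switches.
    print(output_probabilities(0.5, 2, 6))
    print(bandwidth(0.5, 2, 6))
```

With u = 0 and a starting value of 1 (an assumed load, since $p_0 = u(b)$ would be zero in that case), the iterates decay roughly like $2k/((k-1)m)$, in line with the asymptotic form quoted for b = 0.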

7.   An approximate scheme for estimating queuing delays when $b < \infty$.

The analysis of this section follows the assumptions of Section 3
and the additional assumption of Example 1 (each processor independently
generates, at each cycle, a packet with probability p).

Let $b_i$ be the buffer size of queues at stage i. (Here we allow
buffer sizes to vary with stage.)

Let $p^i_f$ be the (steady state) probability that a particular queue
of stage i is full (by symmetry, $p^i_f$ is the same for all queues of
level i). The probability, $p^i_B$, that a particular queue Q of stage i is
blocked depends on $q_1, \ldots, q_k$, the queue sizes of all queues of the
switch of stage i+1 to which the output of Q is connected. Then, clearly,
$p^i_B = p^{i+1}_f$.

We can model the effect of Q being blocked by extending the cycle
size (service time) of Q by an appropriate factor (an increasing
function of $p_B$). However, it is known from elementary queuing theory
that one can get the same mean queuing delay by either increasing the
service time by a factor or by increasing the arrival rate by the same
factor. For queues with Bernoulli arrivals, we can increase the
arrival rate by increasing the number of Bernoulli trials. Hence, we
decided to model the queue Q of stage i as a GI|D|1||$b_i$ queue of buffer
size $b_i$ and Bernoulli arrivals B(p/k, $N_i$), where $N_i = k/(1 - p^i_B)$. Note
that, in reality, arrivals will not be Bernoulli. Also, the selected
value of $N_i$ is supported by experimental results and gives infinite
mean delays when $p_B = 1$, but is otherwise arbitrary. Let N be the
number of processors. Our iterative scheme for calculating mean delays
at various stages is then:

Initialization: $p^{\log N + 1}_f = 0$

For i = log N, ..., 1 do
begin
    [1] Set $p^i_B = p^{i+1}_f$

    [2] Use the model GI|D|1||$b_i$ with input B(p/k, k/(1 - $p^i_B$))
        and constant service of 1 cycle to estimate the queuing
        delay $W_i$ at stage i (by explicitly solving the balance
        equations).

    [3] From [2], estimate
        $p^i_f$ = Prob{the queue GI|D|1||$b_i$ with arrival process
        B(p/k, k/(1 - $p^i_B$)) is full}
end


The total network delay will then be

    $D = \sum_{i=1}^{\log N} W_i$.

Simulations performed on a 6-stage network, built of 2 x 2 switches,
each  containing  2  buffers  of  (uniform)  size  b,  for  b  =  2,4,8  produced 
delays  very  close  to  those  predicted  by  the  iterative  scheme. 

Note that D is a function of the buffer sizes $b_i$, i =
1,...,log N. We pose as an open problem the question of choosing $\{b_i\}$
in such a way that both D and a cost criterion
$C(b_1, \ldots, b_{\log N})$ are minimized.
Appendix
Simulations conducted by [KS,83] and additional simulations we
conducted validate Theorems 1 and 2. The delays (obtained through
simulation) at subsequent network stages are all very close to the
delays of the first stage and do not differ among themselves. Slight
differences among mean delays for various stages are not statistically
significant.

Table 1
Simulation of 512 processors (9 stages) of 2 x 2 switches,
buffer size = 8, pni = processor network interface,
mni = memory network interface

| stage #       | mean queue size, p = 0.2 | mean queue size, p = 0.4 |
| 1             | 0.04516                  | 0.1000                   |
| 2             | 0.05448                  | 0.1530                   |
| 3             | 0.05819                  | 0.1637                   |
| 4             | 0.05829                  | 0.1608                   |
| 5             | 0.06113                  | 0.1618                   |
| 6             | 0.05961                  | 0.1597                   |
| 7             | 0.05886                  | 0.1620                   |
| 8             | 0.06524                  | 0.1679                   |
| 9             | 0.05865                  | 0.1653                   |
| mean pni wait | 0.3326                   | 1.6634                   |

Note 

The pni is a single queue out of each processor, feeding the
network. The slightly smaller delays of the first stage are due to the
large mean pni wait.
Table 2 [KS,83]
Simulation of 6 stages of a network built of 2 x 2 switches,
each containing 2 buffers of size 8.

| probability of transmission            | 0.2   | 0.4   | 0.6   | 0.8   |
|            | packets per cycle         | 0.2   | 0.4   | 0.6   | 0.8   |
|            | waiting per stage         | 0.063 | 0.167 | 0.375 | 1.265 |
| simulation | packets per cycle         | 0.200 | 0.400 | 0.600 | 0.795 |
| simulation | waiting time, 1st stage   | 0.066 | 0.167 | 0.367 | 1.082 |
| simulation | waiting time, 2nd stage   | 0.065 | 0.175 | 0.434 | 1.275 |
| simulation | waiting time, 3rd stage   | 0.069 | 0.201 | 0.457 | 1.328 |
| simulation | waiting time, 4th stage   | 0.069 | 0.195 | 0.456 | 1.316 |
| simulation | waiting time, 5th stage   | 0.070 | 0.202 | 0.431 | 1.298 |
| simulation | waiting time, 6th stage   | 0.066 | 0.196 | 0.459 | 1.289 |
